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1. Introduction 

Molecular dynamics simulation is among the most important and widely used 
tools in the study of molecular systems, providing fundamental insights into molec- 
ular mechanisms at a level of detail unattainable by experimental methods AT87, 
ILea961 IFS9 6 , Sch02j. Usage of molecular dynamics spans a diverse array of fields, 
from physics and chemistry, to molecular and cellular biology, to engineering and 
materials science. Due to their size and complexity, simulations of large systems 
such as biological macromolecules (DNA, RNA, proteins, carbohydrates, and lipids) 
are typically performed under a classical mechanics representation. A critical re- 
quirement of such simulations is ergodicity, or convergence in the limit to the equi- 
librium (typically canonical) Boltzmann measure /i(dq, dp) = Z((3)~ 1 e~ l3H ( <i ' p * > dqdp 
Although ergodicity is commonly assumed, recently {CS081 showed that all com- 
monly used deterministic dynamics methods for simulating the canonical (constant- 
temperature) ensemble fail to be ergodic. They also showed that introduction of a 
stochastic hybrid Monte Carlo (HMC) correction guarantees ergodicity; however, 
HMC scales poorly with system dimension and is rarely used for macromolecules. 
CS08 also show empirically that more commonly used stochastic Langevin dynam- 
ics |Pas 94 appear to exhibit ergodic behavior, but were unable to provide rigorous 
proof. 

The key difficulty in applying existing arguments MSH02] is the appearance of 
singularities in the potential U(q). Most modern molecular mechanics forcefields 
[PCC+951 IBBO+83I I.TTR88] take the form 

C/(q)= K 1 (r-r*f+ £ K 2 {6 - 6*f + £ ^[1 + cos(n<j> - 7)] 

bonds angles dihedrals 




Here the first three terms involve bond length, angle, and torsional energies; being 
bounded, these are easily handled. The difficulty arises from the non-covalent 
electrostatic and Van der Waals forces, the latter modeled by a Lennard-Jones 
potential, which give rise to singularities as two atoms in the system approach each 
other at close range. 
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In this paper we establish the ergodicity of Langevin dynamics for simple two- 
particle system involving a Lennard- Jones type potential. To the best of our knowl- 
edge, this is the first such result for a system operating under this type of potential. 
Moreover we show that the dynamics are geometrically ergodic (have a spectral 
gap) and converge at a geometric rate. Geometric ergodicity is sufficient to im- 
ply existence of a central limit theorem for ergodic averages of functions h with 
E M (|/i| 2+(5 ) < oo for some 8 > |IL71j . and also implies the existence of an ex- 
act sampling scheme |Ken04j . although the latter need not be practical. Loosely, 
proving an ergodic result has two central ingredients. One provides continuity 
of the transition densities in total variation norm which ensures that transitions 
from nearby points behave similarly enough probabilistically which provides the 
basic mechanism of the probabilistic mixing/coupling. This is often expressed in 
a minorization condition (see Section [6]). The other ingredient gives control of ex- 
cursions towards infinity which ensures the existence of a stationary measure and 
guarantees that sufficient probabilistic mixing for an exponential convergence rate. 
The difficulty in a problem is typically one or the other. 

In Section [2j we will see that in the current setting, basic existence of a station- 
ary measure is trivial since the standard Gibbs measure built from the energy is 
invariant. Uniqueness of the stationary distribution follows from now standard re- 
sults on hypoelliptic diffusion. However the control necessary to give a convergence 
rate has previously been elusive. Our approach follows the established method of 
demonstrating the existence of a Lyapunov function and associated small set; how- 
ever, construction of the Lyapunov function in the presence of a singular potential 
is non-trivial and our approach constitutes one of the major innovations of this 
paper. In many ways it builds on ideas in [HM09 and more obliquely is related to 
the ideas in |RBT02| . In both cases, time averaging of the instantaneous energy 
dissipation rate is used to build a Lyapunov function. We use similar ideas here. 
In a nutshell, as in |HM09| the technique consists of casting the behavior of the 
system as the energy heads to infinity as a problem with order one energy contain- 
ing a small parameter equal to one over the original systems energy. Then classical 
stochastic averaging techniques are used to build a Lyapunov function. Though 
the solution is related to |HM09| . the presentation of difficulties is quite different. 
We conclude by briefly discussing the challenges of extending our results to larger 
systems and the case of harmonically growing potential which are not covered by 
our theorems. 



We will use (Q,P) to denote the position and momentum of a particle in a deter- 
ministic system, and (q, p) for the corresponding stochastic system. 

Consider the two-particle Hamiltonian system (Q, P) = ((Qi, Q2), {Pi, P2)) with 
Hamiltonian 



2. A Model Problem 



ffo(Q.P) 



and interaction potential 



K 



(1) 




k=l 
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where au € K with ai,a^ > 0, and oi\ > ■ ■ ■ > a if. We assume that ai > 2 and 
uk < (otherwise no singularity exists). 
The dynamics of this system are given by 

• dH a ■ dH t ■ io 

Qi = n — Pi = — — for i = l,2. 

If we force the system with a noise whose magnitude is scaled to balance dissi- 
pation so as to place the system at temperature T then we arrive at the system of 
coupled SDEs 

dqi = pi dt for i = 1 , 2 
(2) dp x = -U'(qi ~ q 2 ) dt ~ 1Pl dt + adWi(t) 

dpi = U'(qi -q 2 )dt- -fp 2 dt + adW 2 (t) 
where cr 2 = 27T. Define 

§= f {(Pl,9l,P2,92) : 9i 7^ 92}- 

We will prove in Corollary |5.4| that, if the initial conditions are in S, then with 
probability one there exists a unique strong solution to ([2]) which is global in time 
and stays in S. 

We define the Markov semigroup by Vt<p(p, q) = E( Piq ^(p t , q t ) where E( p . q ) is 
the expected value starting from (p, q) . This semigroup has a generator £ given 

by 

r ^ dof >p dHo _d_ _ dHo _d_ _ p .JL + T— 
dpi dqi dqi dpi 1Pl dp, 7 dpf ' 



i=l,2 

Additionally Vt induces a dual action on cr-finite measures p, by acting on the left: 
[iVt- A measure /io is a stationary measure of Vt if PoPt — Po- In our setting, this 
is equivalent to asking that CqPq = where /x (dp,dq) = p (p,q)dpdq. 
It is a simple calculation to see that if 



Pa 



for some C, then £q/°o(p,cj) = 0. Hence with this choice of po, po as defined 
above is a stationary measure. However this measure is not normalizable to make 
a probability measure since it is only cr-finite. This stems from the fact that the 
Hamiltonian is translationally invariant in q. To rectify his problem we will move 
to "center of mass" coordinates. 

2.1. Reduction to Center of Mass Coordinates. Let q = \{q\ q 2 ), P — 

\(pi-P2), q= 5(91+92),^= \{pi+P2), W= §(Wi-W 2 ) andS= \(Wi + W 2 ). 
Then 

dq t = p t dt 

dpt = - Jpt dt + adB t 

(3) 

dq t = Pt dt 

dp t = -U'(2q t ) dt - jp t dt + adW t 

In these new coordinates, the system is described by variables (<?, p) tracking the 
position and momentum of the center of mass, and variables (q, q) tracking the 
relative position and momentum of the particles within the center of mass frame. 
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Q 




Figure 1. Level sets of H(Q,P) — r\ for r\ equals 1 (in blue), 2 
(in green), and 4 (in red) where H(Q, P) = \P 2 + Q 4 + ^Q~ 2 . 



This change of coordinates simplifies our problem to two uncoupled Hamiltonian 
sub-problems. The center of mass (q,p), has Hamiltonian 

which is the Hamiltonian of a free ID particle, with corresponding invariant measure 
given by a Gaussian (for momentum p) times ID Lebesgue measure (for position 
q). Note that p follows an Ornstein-Uhlenbeck process and hence converges expo- 
nentially quickly to its (Gaussian) stationary measure. The position q will diffuse 
through space like ID Brownian motion and hence converges to Lebesgue measure. 

The remaining two variables (q,p) are also a Hamiltonian system with Hamil- 
tonian 

(4) H(q,p) d =^ + U(2q). 

which is a single particle interacting with a potential U that is attractive towards 
the origin at large distances, and repulsive at short distance. So (q,p) will have an 
invariant probability measure. However convergence of this system is more subtle; it 
possesses two difficulties stemming from the structure of the potential. First, since 
U(R) is singular at points, a strictly positive density does not exist everywhere in 
space. Second, there is no immediate candidate for a Lyapunov function. Over- 
coming this second obstacle will prove more difficult and will occupy the bulk of 
this paper. 

3. Reduced system : main results 

We now turn to the study of the two-dimensional Hamiltonian system described 
by Q. In this section we state the principal results on this reduced system. 

Consider the two-dimensional deterministic Hamiltonian system with Hamilton- 
ian 

H{Q,P)^~ + U{Q) 

and hence dynamics 

Qt=gp(Qt,Pt) = Pt and P t = - — (Q uPt ) = -Ui(Q t ). 

This system has only closed orbits, which lie completely in the upper half plane 
denoted by H = {(Q,P) G M 2 : Q > 0} provided the initial points lie in H. 
To see this observe that when |(Q,P)| — > oo, H(Q,P) is well approximated by 
|P 2 + aiQ ai + axQ aK which clearly has level sets that are closed, homotopically 
a circle, and lie completely in the upper half plane. (See Figure [T]). 
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Addition of balanced noise and dissipation yields the associated stochastic system 
of interest. Namely, for positive temperature T, friction 7 and noise standard 
deviation a = ^/2"/T, we have 

dq t = p t dt 

(5) 

d Pt = -U'{q t ) dt - 1Pt dt + a dW t 

This Markov process has generator 

c _ dH d dH d d + d 2 

dp dq dq dp dp dp 2 



and as in the previous section a straightforward calculation shows that /1* (dpxdq) = 
P*{liP)dpdq is a stationary measure with 

p*(q,p) = Ce- H ^» T , 

since C* p* — 0. Unlike the stationary measure of the unreduced system, this 
measure can be normalized and made into a probability measure (since H is no 
longer translationally invariant) for an appropriate choice of C. 

In fact is the unique stationary measure of the system. To see this first 
observe that ^ is hypoelliptic and hence any weak solution to = must 
locally have a smooth density with respect to Lebesgue measure. Since ji* has an 
everywhere positive density with respect to Lebesgue measure it must therefore be 
the only stationary measure, since any stationary measure can be decomposed into 
its ergodic components all of which must have disjoint support. Uniqueness of the 
stationary measure is also a by-product of the exponential convergence given in 



Theorem 3.1 which is our principal interest here. 

To state this convergence result we need a distance between probability measures 
appropriate for our setting. To this end, for any (3 > we define for <fi : H — > R the 
weighted supremum-norm 

U\\ ^ sup |#z,p)| e -^>> 
( q ,p)eB 

and the weighted total-variation norm on signed measure v (with z/(H) = 0) by 

dcf 



|M|/3= sup / 4>dv. 
^■.\\4>h<iJm 

When f) = this is just the standard total- variation norm. We define A^^(H) to 
be the set of probability measures /ioni with J„ exp(/3H)dp < 00. Then we have 
the following convergence result 

Theorem 3.1. There exists /3o > f; so that for any /3 G (0,/3o) there exist positive 
constants C and 5 such that for any two probability measures (i,±, /I2 G Aip(M) 

\\HlPt - toPth < Ce-^Wfii -M2II/3 

In particular the system has a unique invariant measure, which necessarily coincides 
with /i* defined above, and to which the distribution of (qt,Pt) converges. 



Remark 3.2. We will see later that /3q from Theorem 3.1 equals 2A*/T where 



A» is the constant defined in (12) which depends only on the choice of ct\ in the 



potential U but always satisfies A* G (1,2). 
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Our proof of Theorem |3.1| will follow the now standard approach of establishing the 
existence of an appropriate "small set" and Lyapunov function |MT93| . Similar to 
MSH02], we will use a control argument coupled with hypo-ellipticity to establish 
the existence of a small set. While this is rather standard, the technique used to 
prove the existence of a Lyapunov function is less standard and one of the central 
contributions of this note. 

4. The Lyapunov function: Overview 

4.1. Heuristics and motivating discussion. We wish to control motion out to 
infinity (|(g,p)| — > oo) as well as in the neighborhood of the singularity (q — > 0). A 
standard route to obtaining such control is to find a Lyapunov function V : H — > 
(0, oo) so that if V t — V(q t ,pt) then 

(6) dV t < -cVtdt + Cdt + dM t 

for some martingale M t and positive constants c, C and such that H < cqV for 
some positive cq. 

The first reasonable choice might be to try the Hamiltonian H(q,p) itself. Setting 
H t — H(q t ,pt) we see that 

9 

(7) dH t = - 1P 2 dt + ~ dt + a Pt dW t . 

However the function (q,p) i— > p 2 is not bounded from below by (q,p) i— > H(q,p) 
since the two functions are not comparable. This prevents us from obtaining the 
desired bound. If U{R) only has positive powers of R which are greater or equal 
to two, this deficiency can be partially overcome by considering Vt — H t — 7oPt9t- 
Then by picking 70 small enough, we can ensure that -H < V < cH asp 2 + q 2 — > 00 
and that CV is bounded from above by a constant times —V + C for some C > 0. 
Hence V is comparable to H but satisfies the desired Lyapunov-function inequality 
©. See |MSH02j for more on using this trick in this context. 

Unfortunately, this simple trick does not seem to work in the presence of a sin- 
gular repulsive term, as it does not address the explosion of H as q approaches 
though it does rule out explosion as — > 00. This is necessary since the po- 
tential and hence the transition density behaves poorly near this point and uniform 
estimates are not easy (if possible) to obtain. Eventually we will find an appropri- 
ate function ^ so that V = H + ^ works; to do so we need to better understand 
the dynamics. 

With this in mind, we return to Q and take a closer look at the dynamics. 
Looking at the right hand side, it is true that p 2 is not comparable to H(q,p) at 
every given point (q,p) in phase space. Yet if we really believe that the system does 
not blowup the — p 2 must lead to some "dissipation" of energy when the energy is 
large. 

Since we are only interested in preventing the blowup of H , we need only consider 
the equation when the energy is large. However, when the energy is large we know 
a lot about the dynamics. To leading order it will follow the deterministic dynamics 
with stochastic fluctuations of lower order. At high energy, the highest order part 
of the potential U dominates. 

For discussion purposes, in this section, we will assume that 

(8) U{R) = ai|i?| Ql +a 2 \R\ a2 
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for some a\,a2 > and ct\ > 2 > > a.2- Later in Section [5j we will return 
to the general problem. It will be convenient to introduce the following family of 
potentials indexed by a parameter A > 1 

U(R;X)= ai \R\ ai + a 2 \R\ a2 X 2( ^~ 1) . 

Setting A = 1 yields the original problem. The advantage to considering this family 
of potentials is that U (R; A) has the following homogeneity property for £ > 

U(£^R;X) = £ 2 U{R;£X) 

which will lead to all of the scaling properties mentioned subsequently. 

The orbits of the deterministic trajectories are given by the solution set of 
H(Q,P;X) — \P 2 + U(Q;X) = r] for a given energy level r\ > 0. This locus is 
topologically equivalent to a circle and hence setting 

(9) g(Q; 77, A) = y/2(r) - U(Q\ A)), 

the orbit is given by the set {(Q, g(Q; 77, A)), (Q, -g(Q; 77, A)) : Q € [Q-(tj, X),Q + (tj, A)]} 
where Q+{rj, A) and Q- (77, A) are respectively the largest and smallest positive roots 
of 77 — U(Q; A) = 0. 

We wish to determine the average of P 2 around the deterministic orbits, as this 
will give a good idea of the dissipation asymptotically as the energy becomes large. 
We see that averaging P 2 around this deterministic trajectory gives 



/■<2+(n>A) 

(P 2 )( V ,X) = 2 Q (Q; V ,X)dQ; 
jQ-(.nA) 

and similarly that the period t(t], A) of this orbit can be expressed as 
r(t ?J A)=2/ — -dQ 

To make the idea of "large energy" more precise we consider the rescaling of 

2 

phase space defined by the mapping (q,p) H> (lP,£"i Q) for a scale factor £ > 0. 
Under this map the associated energy will essentially scale by a factor £ 2 for large 
£. However this is not exactly correct since the other terms in the potential do 
not scale in the same fashion. However by changing the value of A we can relate 
a scaled Hamiltonian exactly with an unsealed Hamiltonian having A = j, that is, 

since H(Q,P;X) = \P 2 + U{Q; A), we see that H(£P,£^Q; A) = £ 2 H{Q, P; IX). In 
other words, the scaled system behaves exactly like the unsealed system at a higher 
energy. If we define the average 

(10) ^(P 2 )(77,A)^ f (P 2 )(77,A)/r(77,A) 

then we also see that A(P 2 )(hr),X) = hA{P 2 )(r],h^X). 

Summarizing, the average of P 2 around the deterministic orbit with energy hrj 
and A = 1 is the same as h times the average of P 2 around the deterministic orbit 
with energy 77 and X = for the simplified potential in this section. We will see 
that this will hold for sufficiently large energy in the more general setting discussed 
in Section [5] If we define, 

(11) A( v ) d = c A(P 2 ) (15775) 
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Figure 2. Level sets of H (Q, P; A) = 1 for A equals 1 (blue) and 
2.15 (green) where H(Q, P; A) = \P 2 +Q i + ^- ) \-' i Q- 2 The dashed 
line is the level set of \P 2 + Q 4 = 1 with P > to which the level 
sets of H (Q, P; A) = 1 converge as A —> oo. 



then ^4(P 2 )(r;, 1) = f]A(i]). Further observe that as A — ► oo, the level sets under 
potential U(R;X) converge (Figure p|, and „4(P 2 )(1;A) converges to a positive 
constant A* . As we will see later 

(12) C(l~^)^Q _ 2ax 



where = a x ai . Notice that A* is independent of the value of a\ and since 
a x > 2, observe that A* e (1,2). 

Now since at high energy (i.e. 77 1), A(P 2 )(r), 1) = TjA(rj) ps t^A*, it is 
reasonable to approximate ([T]) by 



(13) di/ t w -iKH t dt + °^dt + a J ^^dWt 

when H t 3> 1. In making this approximation, we are not claiming that there is av- 
eraging in the traditional asymptotic sense. Namely, that there is a small parameter 
going to zero which causes the system to speed up and hence the instantaneous ef- 
fect on the system is increasing in the limit that of the averaged parameter. Rather, 
at high energy the system acts (after rescaling) increasingly like a system with order 
one energy and a rescaled parameter A. The rescaling also leads to a rescaling of 
time so that an order one time in the rescaled system represents an increasingly 
short time in the original system. Hence in a short instance of time at high energy, 
one sees the effect of many rotations of the system, making the averaged quantities 
just calculated a good approximation. 

4.2. Numerical explorations. The plots in Figure [3] compare the trajectory of 



the energy predicted by ( 13 1 and the energy trajectory obtained from a numerical 
simulation of |5]) when both were started from the same initial high energy level. 
The model potential given in ([8]) was used with a\ £ {2, 4, 6} and a 2 = —12. Similar 
comparisons with ui equal to -2 and -4 were also made with nearly identical plots 
confirming essentially no dependence on a 2 as predicted by our asymptotic theory. 

Our theory only applies to the two cases a% € {4, 6}. In these cases the agreement 
with the theory, shown with the dashed line, is quite good. One can see a small 
scale wiggle in the numerical curves. This is the effect of the periodic orbit. As the 
scaling theory predicts, the effect decreases as the energy increases since the scaling 




Figure 3. The first three plots are semi- log plots of energy versus 
time for the dynamics using the potential in Q with a\ equal to 2 
(uper most curve), 4 (middle curve), and 6 (lower most curve). The 
solid lines arc numerical simulations and the dashed lines are the 



theoretical prediction made by ( 13 1 



shows that period and the size of the fluctuations go to zero as the energy increases. 
When a\ = 2 our theory does not apply. Nonetheless, the trend given by dotted 
line is followed. However one sees that period and amplitude of the fluctuation is 
not going to zero which is also constant with the scaling arguments predictions. The 
possibility of extending our theory to this boundary case is discussed in Section [8] 

4.3. Using the Poisson equation: Overview. Informed by the preceding dis- 
cussion, we return to the idea of constructing a Lyapunov function V of the form 
V = H + ty, where \t is introduced to handle the singularity in H . To see how 
the average value of —p 2 comes into this framework and make rigorous the intu- 



ition from (13), we will make use of the "Poisson equation" associated with the 
deterministic dynamics, which controls the fluctuation of A(P 2 ). 

We begin by introducing the Liouville operator % associated with the determin- 
istic dynamics and defined by 

(14) (n<P)(Q,P;X) d = l ^(Q,P;\)^(Q,P)-^(Q,P- 1 X)^(Q,P) 

for <f> : R 2 — > K. Next we define "J/ to be the solution to the Poisson equation 

(15) m)(Q, P; A) - 1 (P 2 - A(P 2 )(Q, P; A)) 
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where A(P 2 )(rj, A) is the averaging operator defined by ( 10 ) discussed in Section 4.1 
and we have introduced the slight abuse of notation 

(16) A{P 2 ){Q, P; A) d =U(P 2 )(P (Q, P; A); A) . 

Since P n- P 2 is a smooth function, we will see that ( |T5| has a smooth solution. 

As mentioned, we will choose the Lyapunov function V(q,p) = H(q,p) + m(q,p). 
To understand why this is a good choice, first observe that the generator C of ^ 
can be written as 

8 g 2 d 2 

c = n - 1P dp + Ydp- 2 - 



Letting V t = V(q t ,p t ) we see that since * solves ([15]), A(P 2 ){H t ;l) = A(H t )H t 
and HH = 0, we have that 

2 

(17) dV t = - 1 k{H t )H t dt + U —dt + aptdW t 

dm a 2 d 2 m dm 



The first line of the right-hand side essentially coincides with (13); therefore, to 
realize our goal we need to show that this choice of V is comparable to H and 
that the remaining terms on the right-hand side are negligible at large energies. To 
accomplish these remaining goals we will exploit the scaling of the equations. To 
this end we make the following definition: 

Definition 1. We say that a real valued function 0(Q,P;A) scales like the energy 
to the power n if 

(18) 0(£^Q,eP;\)=f 2 ^(Q,P;£\) 
for all P and Q and I > 1. 

Observe that, as expected, H scales like the energy to the power 1 since 

(19) H(e^Q,ep;\) = e 2 H(Q 1 p-,ex) 

Now (Q, P; A) n- P 2 — A(P 2 )(Q, P; A) also scales like the energy to the power one 
since 

(ep) 2 - A{p 2 )(e^Q,£P;\) = e 2 (p 2 -^(p 2 )(g,p ; a) 



We will see that the solution m to |I5| ) scales like the energy to the power + | , 
which is less than one. Since \m\ grows sublinearly in the energy H as H — > oo, 
V = H + m is comparable to H for large values of H . That is, for any e > there 
exists a positive C e so that 

(1 - e)H - C e < V < (1 + e)H + C t . 

The fact that *5> scales like the energy to the power ^ + § implies that 

dm 2dm f qp, 

dp d P y e ^ £ 

v (<? ' p;1) = £ dp^W^- 
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These observations are made more formal in Proposition [531 In light of this scaling, 



each of the "dt" terms in the second line of ( 17 1 can be bounded by eH t + C e for any 
e > and some matching constant C e . From the discussion around the definition 
of A we see that for some positive Ao and Co, A(H t )H t > 2AoH t — Cq. Hence all 
of the "dt" can be bounded by —AoH t + C. 

Combining all of these considerations we obtain that for some positive c and C, 

—EVt < -cEV t + C 
dt 

and thus V is a good Lyapunov function. 

This is the basic line of argument we will follow in the rest of text. We will 
need to establish the scaling and smoothness properties assumed in the preceding 
calculations. We will also have to modify the argument sightly since while the 



scaling dictated by (20) controls the terms as the Hamiltonian goes to infinity it 
presents a problem as the value of the Hamiltonian goes to zero. Namely unless 
we modify the solution f in a neighborhood of zero the term ^ will explode as 
H -> 0. 

5. The Lyapunov function: Rigorous construction 

We now repeat the argument outlined in the preceding section while dealing 
with various technical points neglected previously and addressing the two main 
deficiencies. First, we will now consider general potentials U of the form given in 
(JlJ and not just the special form given in Section 4.1 Second, the careful reader 



might have noticed that while the scaling given in ( 20 ) ensures that the terms in 
question go to zero at large energy it also implies that they could (depending on 
the a's) blow up at small energies. Fixing this second point will require a small 
modification in the argument. 

In keeping with the discussion of the previous section we now introduce a family 
of potentials U(R; A) defined by 

(21) U(R; A) d = f g + ai \Rr + £ a k \B\<"-\«% " 1} 

k=2 

where the a k and a k satisfy the same constraints as in Q . Since ao can be changed 
without changing the dynamics, we can assume that inr U(R; 1) = 0. Notice that 
U(R;1) gives the same potential as originally defined in Q. Also notice that 
U{£^R; A) = e 2 U{R;£X) for i,R > 0. Or rather, in the language of Definition [2] 
U scales as the energy to the power one. 

For any A > 1, consider the one-dimensional Hamiltonian system with Hamil- 
tonian 

H(Q,P;X) d ^^- + U(Q;X) 

which scales like the energy to the power one (which if it must if our terminology 
is at all reasonable). Associated with this Hamiltonian are the dynamics given by 

Qt=gp(Qt,Pt)=Pt and P t = - — (Q t ,P t ) = -U'(Q t ;X). 

This deterministic system has only closed orbits which lie completely in the upper 
half plane denoted by H = {(Q, P) e K 2 : y > 0} provided (Q ,P ) € H. 
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We are interested in the above system with the addition of balanced noise and 
dissipation. Namely, for temperature T > 0, friction 7 > and noise variance 
o = \/2"fT, we have 

dq t = p t dt 

(22) 

d Pt = -U'(q t ; A) dt - yp t dt + a dW t 

The following is the main technical result concerning the existence of a Lyapunov 
function. The proof follows the argument outlined in Section [4] and will be given 
in the subsequent sections. 

Theorem 5.1. There exists a C°° function V : H — >• [0, 00) so that for any 5 > 
there exists positive constants c and C with 

(1 - 5)H(q,p) -C<V<(1 + 6)H(q,p) + C 

for all (q,p) G EI and writing Vt = V(qt,pt) then one has 

(23) dV t < -7(A* - 8)V t dt + Cdt + dM t 

for some L 2 -martingale M t with quadratic variation: 

2 



(M) t < {a 2 + 6) / (V 3 + C)ds. 
Jo 

Theorem |5.1| has the following three immediate corollaries. 

Corollary 5.2. Let re = inf{t > : (qt,Pt) $ H}; then for all (po, qo) € H, th = 00 
almost surely. And hence the local in time solutions to (22 1 for (j>q, qo) € H provided 
by the standard theory are in fact global in time solutions contained in H for all 
time with probability one. 

Proof of Corollary\5.S\ For any R > H(q ,p ), on the set Dr = {(q,p) G K 2 : 



H(q,p) < R} the coefficients of (22) are Lipschitz. Hence up to the exit time from 
Dr one has the existence and uniqueness of strong solutions of (22 1. Defining the 
stopping time tr — inf{s : (q s ,p s ) Djj}, we have constructed strong solutions of 
(22 1 for t < Tfl. Since R was arbitrary and the solutions don't depend on R up to 
time tr we need only show that for any t > 0, 

p( n {r k <t})=o. 

{ke1:k>H(q ,p )} 

Now using |23] ), it is straightforward to see that Ei? t < CoEV t < 00 for any t > 
and some cq > 0. This combined with the Doob Inequality for Martingales yields 
P(tr < t) < P(swp s < t H s > R) < C(t)/R 2 . Since Y^=i k ~ 2 is summable the 
result follows from the Borel-Cantelli lemma. □ 

For any /3 > 0, we define ^\h) = exp(ph) and set ^ = <f>^(H t ). Then 
from (23) and the fact that a 2 = 27T, it is clear that the following corollary holds. 

Corollary 5.3. For any j3 € (0,2A*/T) there exist positive n and K so that 

E[$^ } S |J- S ] < e- Kt $i /3) + Kt 

where T s is the a-algebra generated by {{q r ,Pr) '■ T < s}. Also recall from (12) that 
A„ G (1,2) since cti > 2 and hence (3 = ^ is always allowed. 

For the next Corollary, we momentarily return to considering the unreduced 
system (p t ,q t ) = (qi(t) , q 2 {t) , Pi(t) , p 2 (t)) defined by 
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Corollary 5.4. Let t§ = inf {t > : (q t , p t ) ^ S}; then for all (qo, Po) € S, r§ = oo 

almost surely. And hence the local in time solutions to ([2| for (qo,Po) & § provided 
by the standard theory are in fact global in time solutions contained in § for all 
time with probability one. 

Proof of Corollary \5.4\ The existence of a global solution (qt,Pt) to ^ is equiva- 
lent the existence of a global solution {qt,Pt,Qt,Pt) which solve s p |. The existence 



of a global solution to (q~t,Pt) follows directly from Corollary 5.2 Since the pair 
(qt,Pt) is independent of (q~t,pt), we can consider it alone. Since it has no singu- 
larity, the existence of a global solution for (q~t,Pt) can be found in many places 
including |MSH02j . □ 

5.1. The Deterministic Trajectories. Since the Hamiltonian H(Q,P; A) is con- 
served for the deterministic dynamics, the deterministic systems will stay on one 
of the connected components of the level set 

(24) £( V . / \) d = l {(Q,P): ~ + U(Q; A) = 77} 

for r\ = H(Q , Pq', A). For fixed A, as rj — > oo the set £(rj, A) approaches the set 

(25) {(Q,P): ~ + ai \Qr+a K \ 2 ^-V\Q\ a « = 77} 



which is of the form discussed in Section 4.1 Observe if we define the scaling map 

2 

Sg: (Q,P) *-> (£ ai Q,iP) then from the scaling of the terms in H, one has 

£(r 1 ,X) = S e (£(r 2 f 1 ,£X)). 



Hence the as A — > oo, one also has that the set £(rj, A) converges to the set in (25 1. 

Hence for either fixed A and large rj or fixed r\ and large A, £ (77, A) is nomotopic to 
a circle and can be written as the union of two segments which are graphs over the P 
axis (see Figure [T]). For r\ large enough, increasing A does not destroy this property. 
That is to say there exists an 77* so that if 77 > 7/* and g(Q; 77, A) = ^2(77 — U (Q; A)) 
then 

{ (Q, q(Q; v, A)) , (Q, -q(Q; v, A)) : Q e [Q-iv, A), Q+fo, A)]} = £(r,, A) 

and 5(77, A) consists of a single connected component which is topologically equiv- 
alent to a circle for all A > 1. Here 

Q-(n, A) is the smallest solution of rj — U(Q; A) = 

Q + (?y, A) is the largest solution of 77 — U(Q; A) = 



Lastly as discussed in Section 4.1 for fixed 77, as A — > 00 we have that £n,X) 
converges pointwise to 

p2 



{(Q,P)- -y+ailQr =77andQ>0} 



5.2. Averaging along trajectories. We now consider the value of certain func- 
tions averaged along trajectories of the deterministic system. Because we will only 
need the averages of functions at large 77 or A, we will restrict to functions <j> £ Ho 
where 

H = : H x [1, 00) -> K : tj>(Q, P; A) = if H(Q, P; A) < 77,} 
where 77* is the constant from Section [5. 1| 
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Since we will only be averaging functions <f> G Ho, we need only to average 
along E(r), A) which are connected and topologically a circle as discussed in the last 
section. 

On such £ observe that if t(t], A) is the period of the orbit then P t — g(Qi) for 
t G [0,t/2] and P t = -g(Q t ) for t £ [r/2,r] if (Qo,P ) = (Q-,0). And we define 

(0>(r?,A)= / ( "' \{Q u P t ;X)dt 

— (rj,A) 







<l>(Qt,Q{Qt);X)dt+ / <f>(Qt,-Q(Qt);\)dt 

J t(^,A) 

Q + (,) < A) 0(g, 77, A); A) + <£(g, -g(q; n 7 A); A) 



-dq 



The period t can be obtained by t(t],X) = (1) (77, A) . We define the "averaging 
operator" A by 

Note that if *(Q,P;A) = ip(H(Q,P; A); A) then _A(*)(77,A) = ^(77, A) since the 
energy is constant on an orbit. 

5.3. Scaling across trajectories. We will need the following refined notion of 
the scaling of a function which allows for the scaling to hold only for sufficiently 
large energy. 

Definition 2. We say that the real valued function 4>{Q, P; A) eventually scales 



like the energy to the K if there exists a constant such that ( 18 ) holds for all 
(Q, P) with H(Q, P; A) > ^ and A > 1. 

Proposition 5.5. If <f> G Ho (eventually) scales like the energy to the power n then: 

{4>){Q, P\ A) (eventually) scales like the energy to the power n-\ , 

a\ 2 

A{(j))(Q, P; X) (eventually) scales like the energy to the power n , 

def) 1 
A) (eventually) scales like the energy to the power k — — , 

dd) 1 

— — (Q,P;A) (eventually) scales like the energy to the power k . 

oQ ' Cil 

In particular, the period of the determinisitc orbits r(r/, A) stasifies r(hr}, A) = 
h a i 2 T(n,h^X) for n > 77* and A > 1, and hence; the function (Q,P;X) H> 
t(H(Q, P; X), X) eventually scales like the energy to the power ~^ — \- 

Proof. Recalling the definition of Q± from (26) and g from ([9]), notice that 

Q ± {hn,X) = h^Q ± (n,hiX) 

1 i 1 

h 2 g(q] 77, h* A) = g(h a i q; hr}. A) 
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1 1 j 

We begin with (0) and A(4>). By the assumption on <f), <p(h^p, h"i q; A) = h K cj)(q,p; /1 5 A), 
so 

,,w, u rQ+i^A) ^ ( fe A)) . A) + ^ ( _ ( h xj)X) ^ 

{4>)(hr],X) = — — dq 

jQ-(h v ,\) Q{q,hr),X) 

Q+( "'^ A) <j>{h^q, hig(q; V, hh A)); A) + fth^q, -hi g(g; g, A)); A) 
Q-()7,h3A) /i 3 ~^T g(q;r),h?X) 

= h K+ ^~H^)( v ,hix) 

Since (g,p) i— > 1 scales like the energy to the power, we see that 

r(hri,X) = (l)(hr), A) = h^~*{l)(r), h^X) = h^~K(r),h?\) 

and hence A((j>)(hr], A) = h K A(<f))(r), h^X). 

We now turn to the statements about the derivatives. If we assume that 6 



(eventually) scales like the energy to the power k, differentiating both sides of ( 18 ) 



e^(£^Q,eP;X)=f K ^(Q,P;£X) 
e^^Q,iP;X)=f^(Q,P;iX) 

and hence and |^ (eventually) scale as the energy to the power k— | and k— ^ 
respectively. □ 

5.4. Ne w Definition of A and its Control. In the heuristic discussion of Sec- 
tion 4.1 we defined A in terms of the average of P 2 at energy one with an appro- 
priately scaled parameter A. We were free to use energy one as our reference energy 
since the simple potential used in Section [4~l] has simply connected level-sets at all 
energies. For the more complicated potential defined in ( |21| , this is only guaranteed 
for energy r\ greater than the 77* introduced in Section |5.1[ Hence we define 

(27) A^R^MiH 4 ) for 77 > 77* 

[0 for 77 < 77* 

With this definition we have that 

(28) A p2 )(v, !) > VHV) for all 77 > 



and that A(r?) —> A* as 77 — > 00. Here A* is as defined in (12) and does not depend 
on our choice of 77* . In fact the following proposition which we will need later follows 
directly from these facts 

Lemma 5.6. For any 5 > there exists a C$ > so that 
(29) (A, - d)r] - C s < r)A(r)) < (A, + 5) v + C s 



for all 77 > 0. 
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5.5. The associated Poisson equation. Recalling the definition of the Liouville 



operator % from (14 1, given a $ € H we will wish to solve 
(30) Wil = $ - A(<f>) . 

Notice that by subtracting A(Q>) we ensure that the right-hand side is in the kernel 



of the adjoint of %. This is necessary to ensure solvability of equation (30 1 



Proposition 5.7. Let $ e H n C fc (H) for some k > 0. Then (|30|) has a solution 
in H nC* fc (H)- 

Proof. We recall that since we have restricted to <f> which are zero for energies less 
than rji,,] hence, we need only solve the equation in regions where the level sets are 
simply connected and homotopically equivalent to a circle. 

In other words, all of the characteristic curves of the operator H are closed curves. 
Hence one can construct a solution by simply specifying values of VP transverse to 
the characteristic curves, say $ = on (<7,p) with p = and H(q,p) > rj*. The 
remaining values of * are obtained by integrating $ — A($>) around the orbit in, 
say, the clockwise direction. This is well defined since by construction $ — A($>) 
has total integral around an orbit and hence the values of ^ from above and 
below p — will match. Since for rj > rj^ , we know the orbits are simple loops the 
integration is straightforward along the lines of the previous section on averaging. 

The smoothness and local boundedness are direct consequences of the same 
properties of $ and the fact that we have excluded the possible singularity at (q,p) 
with H(q,p) = by requiring that $ be zero in a neighborhood of the point. □ 

Proposition 5.8. If $ (eventually) scales like the energy to the power n and Sfr 

l 

2 



solves (30) then ^ (eventually) scales like the energy to the power k - 1 



Proof. Since H scales like the energy to the first power, Proposition |5.5| implies that 
gj5 and |^ scale like the energy to the power | and 1 — respectively. Similarly if 
we assume that VP (eventually) scales like the energy to the power v, then C^> scales 
like the energy to the power v + \ — ^- . If $ scales like the energy to the power k 
then Proposition |5.5| implies that ^ = $ — A(&) also scales to the power n. Hence 

if = $ — A(<&), we conclude that k = v + i — and therefore v = k — I + — 

as claimed. □ 

Remark 5.9. Since we have assumed ct\ > 2, the solution to the Poisson equa- 



tion { 30 1 scales with a lower power than the right-hand side of ( 30 ) . See the 
discussion in Section^for further perspective on the case a\ = 2. 



5.6. Building the Lyapunov function: Proof of Theorem 1 5 . 1 1. Let \ be 

a monotone increasing, C°° function with all of its derivatives bounded such that 
x{v) = f° r V < V* an d x(v) — 1 f° r V — + l where 77* was used in the definition of 
Ho. We define X (Q, P] A) - x(H(Q, P; 1)). Then T(Q, P; A) = P 2 X (Q, P] A) G H 
since T(Q, CP; A) = fP 2 x(Q, P] A) for H > i h + 1. 

By construction T G H and is also smooth and locally bounded; and hence, 
Proposition |5.7| ensures the existence of a smooth locally bounded solution to 



(31) m){Q, P; A) = 7 (T - A(T)(Q, P; A)) 
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As before we define V t = H t + ^ t and observe that 



cr 2 

dV t = - 7 A(T)(H t ; l)dt + — dt + a(p t + ~(q u p t ))dW t 

a* a 2 a 2 * 

-lPtgp{qt,Pt)dt+ YQ]5^(lt,Pt)dt + j(T(q t ,p t ) - p t )dt 

2 

< -jA(H t )H t dt + y dt + E t dt + G t dt + dM t 



where E t = E(q t ,p t ), G t = G(q t ,p t ) and 

a* a 2 a 2 * 

E (Q>P) = -^P-gp(l,P) + Ygp2^ p ) and G W>P) = 7( T (?,P) ~P )■ 
The martingale M t is defined by 

a* 

M t = 



o j (p s + Qp(q s ,p s j)dW s 



To prove the first claim, it is enough to show that for any Sq, Si > , Sq^S t +E t +Gt < 
8\H t + C e for some constant C e . First observe that Gt < for alii > 0, so we need 
only show that 5 ^> t + E t < eH t + C t . Since T(Q, P; A) eventually scales like the 
energy, Proposition |5 . 8| implies that eventually scales like the energy to the power 



| + A- which by assumption is less than one. Proposition 5.5 in turn implies that 



dp'fy eventually scales like the energy to the power A and dp^ eventually scales 
like the energy to the power A- — ~ which is again less than one. Hence, we conclude 
that both and E are dominated by a function which scales like the energy with 
a power strictly less than one. (Notice that both E t and "ft are identically zero if 
Ht < ??*■) Using this scaling, we conclude that <5 \E f t + E t has the desired bound. 



Combining these observations with (29), we see that for any S > there exist a 
positive C so that 

dV t < -7(A» - S)V t dt + Cdt + dM t . 
From the scaling it is clear that for any 6 > there exists a positive C such that 
(1 - 5)H(q,p) -C< V(q,p) < (1 + S)H(q,p) + C 



for all (q,p) € H, which completes the proof of Theorem 5.1 Letting (M) t de- 
note the quadratic variation of the continuous martingale Mt, following arguments 
similar to those just complete, we see that d(M) t < + 5){V t + C)dt. 

6. Existence of a "small set" 

The following Lemma provides the remaining missing ingredient required to 
prove geometric ergodicity. 

Lemma 6.1. For every r\ > 0, there exists a probability measure v supported in H, 
a t > and cq > so that 

inf Vt((q,p),A)>cov(A) 

(q,p):H(q,p)<ri 

for any ici. 
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Proof. Let C denote the generator of the Markov semigroup associated to (22). 
We begin by observing that for any s £ (0, t) the operator d s — C is hypoelliptic. 
(See |MSH02j for the straightforward calculation of Lie-brackets.) Fixing an xq — 
(poi9o)j this implies that the transition V s (xo,dy) to y — (q s ,p s ) has a smooth 
density inside of H. In particular, there exists a p s (xo,y) so that 



V s (x ,B s (x )) = / p s {x ,y)dy 

JB s (x ) 

for a sufficiently small <5-ball around Xq and a sufficiently small s > 0. Furthermore, 
ps{xo, y) is smooth at y £ B$(xo)- By considering the adjoint of C, one can conclude 
that p s (xo,y) is smooth at (x ,y) £ Bs(x ) X ^(xo). (See |Str08] L Since for small 
enough t we are sure that V s (xo, Bg(xo)) > there must exist a, yo £ Bs(xq), a 
c' Q > and a possibly smaller <5 so that 

inf Ps(x,y) > c' Q > 0. 

(i,!/)eBs(io)xBa(!/i)) 

Now one can follow Lemma 3.4 of [MSH02| to construct a control argument 
ensuring that given any open set O C H and H(ry) = : H(q,p) < rj} there 

exists a t > and Cq such that 

inf V t (z,0) > eg 

The argument in |MSH02j assumes that the drift vector field is bounded on compact 
sets. This is still true if we restrict to H(?/) for any finite rj > 0. The uniform lower 
bound is not explicitly mentioned, however one can pick a single tubular neighbor- 
hood size of the needed control and ensure that the control and its derivatives are 
uniformly bounded for all starting and ending points in H(ry). 

Setting t = r + s, defining v as normalized Lebesgue measure on B$(yo) and 
combining the preceding two estimates produces, for any z £ M.(r]) and A c H 



V t (z,A) = / V r (z,dy)V s (y,A) 
Jn 

> [ V r (z,dy)V s (y, AnB s (y )) 

JB s (x ) 



> / / P r (z,dy)P s (y,dz) 

JAnB s (y ) JBs(x ) 

>c^X leb (AnBs{y )) f V r (z,dy) 

>^X leb (AnB 5 (y )) = M A) 
which concludes the proof. □ 



7. Proof of Theorem I3TT 



Theorem 3.1 follows by combining Theorem 5.1 and Lemma 6.1 and invoking 
Theorem 1.2 from HM08]. This result is a repackaging of a well known result of 
Harris. It can be found in many places. Most appropriate for the current discussion 
is the work of Meyn and Tweedie exemplified by MT93, Section 15]. 
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8. Conclusion 



We began by observing that to leading order, the dynamics at high energy fol- 
low the deterministic dynamics given by a modified Hamiltonian perturbed by a 
small noise. To leverage this observation stochastic averaging techniques, built on 
auxiliary Poisson equation methods, were used to construct a Lyapunov function 
sufficient to prove exponential convergence to equilibrium. The central result, given 
in Theorem |3.1 1 covers important singular potentials, including Lennard- Jones type 
potentials, which had not been covered by previous results. Theorem |3.1| has two 
principal remaining deficiencies. First it only applies to two interacting particles in 
isolation. Second, Theorem |3 . 1 1 does not cover the classical case where the confining 
potential U grows quadratically at infinity. 

In principle, the extension to many particles could follow a similar route, since 
when two particles are near each other their principal interaction is with each other 
while other particles are just a small perturbation. However it is possible that the 
orbit over which one must average could also interact with other particles. This 
would make finding closed form representations of the averaging measure difficult 
at best (chaotic orbits are to be expected). Even if in some setting the high en- 
ergy orbits remain of the type considered here, the combinatorics of the possible 
interactions would be complicated. 

In contrast, the extension to potentials with quadratic growth is almost certainly 
within reach. In fact, Figure [3] gives a strong indication how to proceed. Since for 
Q'i = 2 the period of oscillation is not going to zero as the energy of the system 
increases, instantaneous homogenization/averaging of the effect of one orbit is not 
feasible. However, building on an idea from |RBT02] one could consider the average 
of the energy over one period r of the system. First observe that r has a limit r* > 
as the energy goes to oo. Namely one would consider the quantity 



Since at high energy p s will be very close to the deterministic orbit, one can likely 
prove that 



This could then be used to obtain control of the excursions away from the center 
of space. Note that following the above argument will not produce an Lyapunov 
function which infinitesimally decreasing on average as was constructed in this 
note. This argument essentially amortizes the total energy dissipation which occurs 
over a single orbit, smoothing out the times when the infinitesimal rate of energy 
dissipation nears zero. Since we feel that covering the quadratic case is not sufficient 
motivation for the extra complications, we have chosen to use the approach used 
in this note since it leads to a more classical "infinitesimal" Lyapunov function. 
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